Index(['PC1', 'PC2', 'PC3', 'PC4'], dtype='object')
Classification Report:
              precision    recall  f1-score   support

           0       0.54      0.20      0.29       215
           1       0.63      0.89      0.74       327

    accuracy                           0.61       542
   macro avg       0.58      0.54      0.51       542
weighted avg       0.59      0.61      0.56       542
Model Components¶
1. Data Preparation¶
- Selects relevant features:
  - OHLC prices
  - Volume
  - Moving averages
  - Technical indicators
- Converts target variable to binary format (BUY/SELL)
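The preparation step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`volume`, `signal`), the helper `prepare_data`, and the encoding BUY = 1 / SELL = 0 are all assumptions, and the data is synthetic.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the notebook's real schema is not shown.
FEATURES = ["open", "high", "low", "close", "volume", "10_DMA", "30_DMA"]


def prepare_data(df, target_col="signal"):
    """Select the feature columns and encode BUY/SELL as 1/0 (assumed mapping)."""
    X = df[FEATURES].copy()
    y = df[target_col].map({"SELL": 0, "BUY": 1})
    # Keep only rows where every feature and the label are present.
    mask = X.notna().all(axis=1) & y.notna()
    return X[mask], y[mask].astype(int)


# Small synthetic frame, for illustration only.
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, 60).cumsum()
df = pd.DataFrame({
    "open": close + rng.normal(0, 0.2, 60),
    "high": close + 1.0,
    "low": close - 1.0,
    "close": close,
    "volume": rng.integers(1_000, 5_000, 60),
})
df["10_DMA"] = df["close"].rolling(10).mean()
df["30_DMA"] = df["close"].rolling(30).mean()
# Illustrative labeling rule, not the notebook's.
df["signal"] = np.where(df["10_DMA"] > df["30_DMA"], "BUY", "SELL")

X, y = prepare_data(df)
print(X.shape, y.value_counts().to_dict())
```

Rows at the start of the series drop out because the 30-period moving average is undefined until 30 observations are available.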
2. Model Training¶
- Splits data into train/test sets
- Trains logistic regression model
- Implements cross-validation
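A hedged sketch of the training step is shown below. The `PC1` to `PC4` index printed in the output suggests the features were reduced to four principal components, so the pipeline here assumes PCA with four components; the chronological split, `TimeSeriesSplit`, and the synthetic data are likewise assumptions rather than the notebook's confirmed settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared feature matrix and binary target.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 7))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# shuffle=False keeps the split chronological (an assumption for time-series data).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Scale, project onto 4 principal components, then fit logistic regression.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=4),
    LogisticRegression(max_iter=1000),
)

# Cross-validate on the training portion only, with forward-chaining folds.
cv_scores = cross_val_score(
    model, X_train, y_train, cv=TimeSeriesSplit(n_splits=5), scoring="accuracy"
)

model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print("CV accuracy per fold:", np.round(cv_scores, 3))
print("Hold-out accuracy:", round(test_accuracy, 3))
```

Putting the scaler and PCA inside the pipeline means they are refit on each training fold, so no statistics from the validation fold leak into the transform.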
3. Evaluation Metrics¶
- Classification report showing:
  - Precision
  - Recall
  - F1-score
  - Support
- Confusion matrix visualization
- ROC curve with AUC score
- Feature importance plot
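The metrics listed above can be computed with scikit-learn along these lines. The labels and probabilities are toy values chosen for illustration, and the 0.5 decision threshold and the class names are assumptions.

```python
import numpy as np
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    roc_auc_score,
    roc_curve,
)

# Toy ground truth and predicted BUY probabilities (illustrative only).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_prob = np.array([0.2, 0.6, 0.4, 0.7, 0.8, 0.3, 0.9, 0.65, 0.55, 0.75])
y_pred = (y_prob >= 0.5).astype(int)  # assumed threshold

# Per-class precision, recall, F1-score and support.
report = classification_report(y_true, y_pred, target_names=["SELL", "BUY"])

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# AUC and the ROC curve use the probabilities, not the thresholded labels.
auc = roc_auc_score(y_true, y_prob)
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

print(report)
print(cm)
print("ROC-AUC:", auc)
```

Note that the ROC curve and AUC are threshold-independent, while the classification report and confusion matrix depend on the chosen cut-off.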
4. Visualizations¶
- Interactive Plotly plots for all visualizations
- Clear titles and labels
- Color-coded results for better interpretation
- Includes:
  - Confusion matrix heatmap
  - ROC curve plot
  - Feature importance bar chart
  - Signal prediction overlay on price chart
Fitting SARIMAX model...
c:\Users\coolm\Notebooks\.venv\Lib\site-packages\statsmodels\base\model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
Training Random Forest model...
Random Forest Classification Report:
              precision    recall  f1-score   support

           0       0.96      0.97      0.97       215
           1       0.98      0.98      0.98       327

    accuracy                           0.97       542
   macro avg       0.97      0.97      0.97       542
weighted avg       0.97      0.97      0.97       542
SARIMAX and Random Forest Trading Model¶
Model Components¶
1. SARIMAX Modeling¶
- Uses close price as endogenous variable
- Uses other features as exogenous variables
- Predicts future prices
2. Random Forest Classification¶
- Uses SARIMAX predictions as additional features
- Predicts buy/sell signals
- Includes feature importance analysis
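The second stage can be sketched as below. To keep the example self-contained, the SARIMAX forecast is replaced by a simulated `sarimax_pred` column; that column, the derived `pred_change` feature, the labeling rule, and the hyperparameters are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
n = 400
close = 100 + rng.normal(size=n + 1).cumsum()
today, tomorrow = close[:-1], close[1:]

# Stand-in for the stage-one forecast of the next close (noisy by design).
sarimax_pred = tomorrow + rng.normal(scale=1.0, size=n)

features = pd.DataFrame({
    "close": today,
    "volume": rng.integers(1_000, 5_000, size=n),
    "sarimax_pred": sarimax_pred,
    "pred_change": sarimax_pred - today,  # forecast move relative to today
})
# Illustrative label: 1 (BUY) if the next close is higher, else 0 (SELL).
y = (tomorrow > today).astype(int)

# Chronological split: train on the first 80%, evaluate on the rest.
split = int(n * 0.8)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(features.iloc[:split], y[:split])
pred = rf.predict(features.iloc[split:])

# Impurity-based importances, one value per input column.
importances = pd.Series(
    rf.feature_importances_, index=features.columns
).sort_values(ascending=False)
print(importances)
```

In a real pipeline, the forecast values used for the test rows would need to be genuine out-of-sample predictions; in-sample fitted values would carry information about the target into the classifier.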
3. Visualizations¶
- SARIMAX predictions vs actual prices
- Confusion matrix
- ROC curve
- Feature importance
- Final trading signals on price chart
4. Evaluation Metrics¶
- Classification report
- ROC-AUC score
- Confusion matrix
Usage Instructions¶
Data Preparation
- Ensure DataFrame has all required columns:
  - OHLC data (open, high, low, close)
  - Volume
  - Technical indicators (10_DMA, 30_DMA)
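A quick pre-flight check for these columns could look like the following. The helpers `missing_columns` and `add_moving_averages` are hypothetical, the `volume` column name is assumed, and `10_DMA` / `30_DMA` are assumed to be simple moving averages of the close.

```python
import pandas as pd

# Names taken from the list above; "volume" is an assumed spelling.
REQUIRED_COLUMNS = ("open", "high", "low", "close", "volume", "10_DMA", "30_DMA")


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


def add_moving_averages(df):
    """Add 10- and 30-period simple moving averages of the close (assumed definition)."""
    out = df.copy()
    out["10_DMA"] = out["close"].rolling(10).mean()
    out["30_DMA"] = out["close"].rolling(30).mean()
    return out


# Illustrative frame with raw OHLCV data only.
df = pd.DataFrame({
    "open": [10.0, 10.5, 10.2],
    "high": [10.8, 10.9, 10.6],
    "low": [9.9, 10.1, 10.0],
    "close": [10.5, 10.2, 10.4],
    "volume": [1200, 1500, 1100],
})

print(missing_columns(df))                       # indicators not yet computed
print(missing_columns(add_moving_averages(df)))  # nothing missing afterwards
```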
Model Execution
run_combined_model(df)
Output
- Displays all relevant plots
- Prints performance metrics
- Returns model objects and predictions
Approach Benefits¶
Two-Stage Prediction
- First predicts price movements using SARIMAX
- Then passes those predictions to the Random Forest as additional features for signal generation
Comprehensive Analysis
- Provides detailed visualization of results
- Includes multiple performance metrics
Model Interpretability
- Feature importance analysis
- Visual representation of predictions
- Clear performance metrics
Easy Evaluation
- Multiple evaluation metrics
- Visual confirmation of predictions
- Trading signal visualization